Drum Sound Detection in Polyphonic Music with Hidden Markov Models

Authors

  • Jouni Paulus
  • Anssi Klapuri
Abstract

This paper proposes a method for transcribing drums from polyphonic music using a network of connected hidden Markov models (HMMs). The task is to detect the temporal locations of unpitched percussive sounds (such as bass drum or hi-hat) and to recognise the instruments played. Unlike many earlier methods, no separate sound event segmentation is performed; instead, connected HMMs perform the segmentation and recognition jointly. Two ways of using HMMs are studied: modelling combinations of the target drums, and detector-like modelling of each target drum. The acoustic features are mel-frequency cepstral coefficients and their first-order temporal derivatives. The effect of reducing the feature dimensionality with principal component analysis and linear discriminant analysis is evaluated. Unsupervised adaptation of the acoustic model parameters with maximum likelihood linear regression is evaluated as a way to compensate for differences between the training and target signals. The performance of the proposed method is evaluated on a publicly available data set containing signals with and without accompaniment, and compared with two reference methods. The results suggest that transcription is possible using connected HMMs, and that detector-like models for each target drum perform better than models of drum combinations.
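The detector-like modelling mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hypothetical two-state model (silence and sound) for a single drum, a toy one-dimensional feature in place of MFCC vectors, and hand-picked Gaussian and transition parameters rather than trained ones. Viterbi decoding then performs segmentation and recognition in one pass, and onsets are read off where the decoded path enters the sound state. All function and variable names are illustrative.

```python
import math

SILENCE, SOUND = 0, 1  # hypothetical state labels for one target drum


def log_gauss(x, mean, var):
    """Log-density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi(obs, log_start, log_trans, emit_params):
    """Return the most likely state sequence for a 1-D observation sequence."""
    n_states = len(log_start)
    # delta[s]: best log-score of any path ending in state s at the current frame
    delta = [log_start[s] + log_gauss(obs[0], *emit_params[s])
             for s in range(n_states)]
    backptrs = []
    for x in obs[1:]:
        new_delta, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s
            best = max(range(n_states), key=lambda p: delta[p] + log_trans[p][s])
            ptr.append(best)
            new_delta.append(delta[best] + log_trans[best][s]
                             + log_gauss(x, *emit_params[s]))
        delta = new_delta
        backptrs.append(ptr)
    # Backtrack from the best final state
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def onsets(path, sound_state=SOUND):
    """Frame indices at which the path enters the sound state."""
    return [t for t, s in enumerate(path)
            if s == sound_state and (t == 0 or path[t - 1] != sound_state)]


# Toy parameters (assumptions, not values from the paper)
log_start = [math.log(0.9), math.log(0.1)]
log_trans = [[math.log(0.9), math.log(0.1)],   # silence -> silence / sound
             [math.log(0.3), math.log(0.7)]]   # sound   -> silence / sound
emit_params = [(0.0, 1.0), (5.0, 1.0)]         # (mean, variance) per state

features = [0.1, -0.2, 5.2, 4.8, 0.3, 0.0, 5.1, 0.2]  # one toy value per frame
path = viterbi(features, log_start, log_trans, emit_params)
print(path)          # [0, 0, 1, 1, 0, 0, 1, 0]
print(onsets(path))  # [2, 6]
```

In a fuller system, one such detector would run per target drum on multi-dimensional feature vectors, with the emission and transition parameters estimated from annotated training data.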

Related resources

Drum Transcription from Polyphonic Music with Instrument-wise Hidden Markov Models

This paper describes a system for automatic transcription of drum instruments from polyphonic music signals. For each target drum instrument, a hidden Markov model (HMM) is created to describe the sound characteristics when the instrument is played. Also, a background model with only one state is created for each instrument to describe the sound when the target instrument is not played. The sig...

Automatic Chord Detection Using Harmonic Sound Emphasized Chroma from Musical Acoustic Signal

In this abstract we describe a method to automatically detect chord progressions from a musical acoustic signal. We suppress drum sounds because most popular music contains drums, and such non-harmonic sounds hinder chord detection. We use a harmonic/percussive sound separation technique developed in our laboratory to obtain a harmonic-emphasized signal, then we use chroma vector and hidden Markov models t...

Signal Processing Methods for Drum Transcription and Music Structure Analysis

This thesis proposes signal processing methods for the analysis of musical audio on two time scales: drum transcription on a finer time scale and music structure analysis on the time scale of entire pieces. The former refers to the process of locating drum sounds in the input and recognising the instruments that were used to produce the sounds. The latter refers to the temporal segmentation of ...

Explicit Duration Hidden Markov Models for Multiple-Instrument Polyphonic Music Transcription

In this paper, a method for multiple-instrument automatic music transcription is proposed that models the temporal evolution and duration of tones. The proposed model supports the use of spectral templates per pitch and instrument which correspond to sound states such as attack, sustain, and decay. Pitch-wise explicit duration hidden Markov models (EDHMMs) are integrated into a convolutive prob...

Musical Acoustics and Speech Communication: Musical Pitch Tracking and Sound Source Separation Leading to Automatic Music Transcription II

This paper describes research aimed at building "active music listening interfaces" to demonstrate the importance of music understanding technologies, including sound source separation and F0 estimation, and the benefit they offer to end users. Active music listening is a way of listening to music through active interactions. Given polyphonic sound mixtures taken from available music recordin...

Journal title:
  • EURASIP J. Audio, Speech and Music Processing

Volume 2009

Publication date: 2009